Speaker Recognition on Single- and Multispeaker Data
Authors
Abstract
We discuss Dragon Systems’ approach to the NIST Speaker Recognition tasks. For the one-speaker task, we employ a combination of methods: a basic GMM system and two LVCSR-based systems, one using standard mixture models and the other using nonparametric techniques. We discuss some explorations of the recently introduced two-speaker tasks based on the GMM system alone. “Cheating” tests using NIST-supplied keys lead us to some improvements in channel normalization, and illuminate the roles that speaker segmentation and segment selection play in these tasks.
Similar resources
Unsupervised segmentation and verification of multi-speaker conversational speech
This paper presents our approach to unsupervised multispeaker conversational speech segmentation. Speech segmentation is obtained in two steps that employ different techniques. The first step performs a preliminary segmentation of the conversation analyzing fixed length slices, and assumes the presence in every slice of one or two speakers. The second step clusters the segments obtained by the ...
The challenge of multispeaker lip-reading
In speech recognition, the problem of speaker variability has been well studied. Common approaches to dealing with it include normalising for a speaker’s vocal tract length and learning a linear transform that moves the speaker-independent models closer to a new speaker. In pure lip-reading (no audio) the problem has been less well studied. Results are often presented that are based on speak...
The Speakers in the Wild (SITW) Speaker Recognition Database
The Speakers in the Wild (SITW) speaker recognition database contains hand-annotated speech samples from open-source media for the purpose of benchmarking text-independent speaker recognition technology on single and multi-speaker audio acquired across unconstrained or “wild” conditions. The database consists of recordings of 299 speakers, with an average of eight different sessions per person....
The Approach of Mean Shift based Cosine Dissimilarity for Multi-Recording Speaker Clustering
Speaker clustering is an important task in many applications such as Speaker Diarization as well as Speech Recognition. Speaker clustering can be done within a single multispeaker recording (Diarization) or for a set of different recordings. In this work we are interested in the former case and we propose a simple iterative Mean Shift (MS) algorithm. The MS algorithm is based on Euclidean distance....
Speaker Change Detection using Support Vector Machines
Speaker change detection is important for automatic segmentation of multispeaker speech data into homogeneous segments, with each segment containing the data of one speaker only. Existing approaches for speaker change detection are based on the dissimilarity of the distributions of the data before and after a speaker change point. In this paper, we propose a classification-based technique for sp...
Journal title:
- Digital Signal Processing
Volume 10, Issue
Pages -
Publication date 2000